Updating formulas and algorithms for computing entropy and Gini index on time-changing data streams

نویسنده

  • Blaz Sovdat
چکیده

Despite growing interest in data stream mining the most successful incremental learners still use periodic recomputation to update attribute information gains and Gini indices. This note provides simple incremental formulas and algorithms for computing entropy and Gini index from time-changing data streams.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Statistical Sources of Variable Selection Bias in Classification Tree Algorithms Based on the Gini Index

Evidence for variable selection bias in classification tree algorithms based on the Gini Index is reviewed from the literature and embedded into a broader explanatory scheme: Variable selection bias in classification tree algorithms based on the Gini Index can be caused not only by the statistical effect of multiple comparisons, but also by an increasing estimation bias and variance of the spli...

متن کامل

Application of Different Methods of Decision Tree Algorithm for Mapping Rangeland Using Satellite Imagery (Case Study: Doviraj Catchment in Ilam Province)

Using satellite imagery for the study of Earth's resources is attended by manyresearchers. In fact, the various phenomena have different spectral response inelectromagnetic radiation. One major application of satellite data is the classification ofland cover. In recent years, a number of classification algorithms have been developed forclassification of remote sensing data. One of the most nota...

متن کامل

Data Replication-Based Scheduling in Cloud Computing Environment

Abstract— High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...

متن کامل

Construction of Decision Tree for Insurance Policy System through Entropy and GINI Index

In the modern age of computing, sparse and irregularity in a sample of data is needed for reconstruction of the large and different types of dimensions of data. However, there is a challenge to analyze this type of data. One issue is the robust selection of data. There are many analytical tools are available but crucial part is the decomposition, and make the clusters as well as gradient estima...

متن کامل

Unifying Decision Trees Split Criteria Using Tsallis Entropy

The construction of efficient and effective decision trees remains a key topic in machine learning because of their simplicity and flexibility. A lot of heuristic algorithms have been proposed to construct near-optimal decision trees. Most of them, however, are greedy algorithms which have the drawback of obtaining only local optimums. Besides, common split criteria, e.g. Shannon entropy, Gain ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1403.6348  شماره 

صفحات  -

تاریخ انتشار 2014